Proximity Data


Multidimensional scaling of two-mode three-way asymmetric dissimilarities: finding archetypal profiles and clustering

Alcacer, Aleix, Benitez, Rafael, Bolos, Vicente J., Epifanio, Irene

arXiv.org Machine Learning

Multidimensional scaling visualizes dissimilarities among objects and reduces data dimensionality. While many methods address symmetric proximity data, asymmetric and especially three-way proximity data (capturing relationships across multiple occasions) remain underexplored. Recent developments, such as the h-plot, enable the analysis of asymmetric and non-reflexive relationships by embedding dissimilarities in a Euclidean space, allowing further techniques like archetypoid analysis to identify representative extreme profiles. However, no existing methods extract archetypal profiles from three-way asymmetric proximity data. This work extends the h-plot methodology to three-way proximity data under both symmetric and asymmetric, conditional and unconditional frameworks. The proposed approach offers several advantages: intuitive interpretability through a unified Euclidean representation; an explicit, eigenvector-based analytical solution free from local minima; scale invariance under linear transformations; computational efficiency for large matrices; and a straightforward goodness-of-fit evaluation. Furthermore, it enables the identification of archetypal profiles and clustering structures for three-way asymmetric proximities. Its performance is compared with existing models for multidimensional scaling and clustering, and illustrated through a financial application. All data and code are provided to facilitate reproducibility.
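The explicit, eigenvector-based solution the abstract mentions is, at its core, the classical (Torgerson) multidimensional scaling decomposition. Below is a minimal sketch of that classical step for a plain symmetric dissimilarity matrix; the paper's three-way, asymmetric, and h-plot extensions are not reproduced here.

```python
import numpy as np

def classical_mds(D, k=2):
    """Embed a symmetric dissimilarity matrix D into k dimensions
    via the eigendecomposition of the double-centered matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centered Gram matrix
    w, V = np.linalg.eigh(B)              # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]         # top-k eigenpairs
    w_top = np.clip(w[idx], 0, None)      # guard against tiny negatives
    return V[:, idx] * np.sqrt(w_top)

# Three collinear points at positions 0, 3, 7 -> a 1-D embedding is exact
D = np.array([[0., 3., 7.],
              [3., 0., 4.],
              [7., 4., 0.]])
X = classical_mds(D, k=1)
```

Because the solution is a single eigendecomposition, it is free of local minima and its goodness of fit can be read off the discarded eigenvalues, which is the advantage the abstract highlights.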


Out-of-Sample Embedding with Proximity Data: Projection versus Restricted Reconstruction

Trosset, Michael W., Tan, Kaiyi, Tang, Minh, Priebe, Carey E.

arXiv.org Machine Learning

The problem of using proximity (similarity or dissimilarity) data for the purpose of "adding a point to a vector diagram" was first studied by J.C. Gower in 1968. Since then, a number of methods -- mostly kernel methods -- have been proposed for solving what has come to be called the problem of *out-of-sample embedding*. We survey the various kernel methods that we have encountered and show that each can be derived from one or the other of two competing strategies: *projection* or *restricted reconstruction*. Projection can be analogized to a well-known formula for adding a point to a principal component analysis. Restricted reconstruction poses a different challenge: how to best approximate redoing the entire multivariate analysis while holding fixed the vector diagram that was previously obtained. This strategy results in a nonlinear optimization problem that can be simplified to a unidimensional search. Various circumstances may warrant either projection or restricted reconstruction.
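The projection strategy's analogy to adding a point to a principal component analysis can be sketched for the classical-MDS case, in the spirit of Gower's 1968 point-addition formula. This is an illustrative reconstruction under stated assumptions (a column-centered training embedding and exact squared distances), not the paper's full treatment.

```python
import numpy as np

def out_of_sample_project(X, d2_new):
    """Project a new point into an existing embedding X (rows = embedded
    training points, assumed column-centered), given the new point's
    squared distances d2_new to the training points."""
    sq_norms = np.sum(X ** 2, axis=1)   # ||x_i||^2 for each training point
    # recover the inner products x_i . z from the squared distances,
    # using column-centering to eliminate the unknown ||z||^2 term
    k = -0.5 * ((d2_new - d2_new.mean()) - (sq_norms - sq_norms.mean()))
    return np.linalg.pinv(X) @ k        # least-squares coordinates for z

# Toy check: embed 4 points in the plane, then "add back" one of them
X_train = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
X_train = X_train - X_train.mean(axis=0)   # center the embedding
z_true = X_train[3]
d2 = np.sum((X_train - z_true) ** 2, axis=1)
z_hat = out_of_sample_project(X_train, d2)
```

When the new point lies in the span of the training configuration, as here, the projection recovers it exactly; restricted reconstruction differs precisely when that least-squares fit is a poor summary of the new proximities.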


Complex-valued embeddings of generic proximity data

Münch, Maximilian, Straat, Michiel, Biehl, Michael, Schleif, Frank-Michael

arXiv.org Machine Learning

Proximities are at the heart of almost all machine learning methods. If the input data are given as numerical vectors of equal length, the Euclidean distance or a Hilbertian inner product is frequently used in modeling algorithms. In a more generic view, objects are compared by a (symmetric) similarity or dissimilarity measure, which may not obey particular mathematical properties. This renders many machine learning methods invalid, leading to convergence problems and the loss of guarantees such as generalization bounds. In many cases, the preferred dissimilarity measure is non-metric, like the earth mover's distance, or the similarity measure may not be a simple inner product in a Hilbert space but rather in its generalization, a Krein space. If the input data are non-vectorial, like text sequences, proximity-based learning or n-gram embedding techniques can be applied. Standard embeddings lead to the desired fixed-length vector encoding, but are costly and have substantial limitations in preserving the original data's full information. As an information-preserving alternative, we propose a complex-valued vector embedding of proximity data. This allows suitable machine learning algorithms to use these fixed-length, complex-valued vectors for further processing. The complex-valued data can serve as input to complex-valued machine learning algorithms. In particular, we address supervised learning and use extensions of prototype-based learning. The proposed approach is evaluated on a variety of standard benchmarks and shows strong performance compared to traditional techniques in processing non-metric or non-PSD proximity data.
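The core Krein-space idea, embedding an indefinite (non-PSD) similarity matrix so that a bilinear, non-conjugate product of complex vectors reproduces it, can be sketched with a plain eigendecomposition; this is a generic illustration, not the authors' exact pipeline.

```python
import numpy as np

def complex_embedding(S):
    """Complex-valued vector embedding of a symmetric but possibly
    indefinite similarity matrix S: negative eigenvalues receive purely
    imaginary axis scalings, so the bilinear product X @ X.T (no complex
    conjugation) reproduces S exactly."""
    w, V = np.linalg.eigh(S)
    scale = np.sqrt(w.astype(complex))   # sqrt(-a) becomes i*sqrt(a)
    return V * scale                     # rows are the embedded objects

# Indefinite toy similarity matrix (eigenvalues 3 and -1)
S = np.array([[1.0, 2.0],
              [2.0, 1.0]])
X = complex_embedding(S)
# X @ X.T (bilinear, not Hermitian) recovers S, negative part included
```

A positive semi-definite S would yield a purely real embedding; the imaginary components carry exactly the "negative" part of the Krein-space inner product that a real Euclidean embedding cannot represent.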


Classification on Pairwise Proximity Data

Graepel, Thore, Herbrich, Ralf, Bollmann-Sdorra, Peter, Obermayer, Klaus

Neural Information Processing Systems

We investigate the problem of learning a classification task on data represented in terms of their pairwise proximities. This representation does not refer to an explicit feature representation of the data items and is thus more general than the standard approach of using Euclidean feature vectors, from which pairwise proximities can always be calculated. Our first approach is based on a combined linear embedding and classification procedure, resulting in an extension of the Optimal Hyperplane algorithm to pseudo-Euclidean data. As an alternative, we present another approach based on a linear threshold model in the proximity values themselves, which is optimized using Structural Risk Minimization. We show that prior knowledge about the problem can be incorporated by the choice of distance measures, and examine different metrics w.r.t.
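The second approach, a linear threshold model on the proximity values themselves, can be illustrated with a plain perceptron over proximity-matrix rows. This is a deliberate simplification: the paper optimizes the threshold model via Structural Risk Minimization, which is omitted here.

```python
import numpy as np

def proximity_perceptron(P, y, epochs=100, lr=0.1):
    """Linear threshold model directly on proximities: object i is
    represented by row P[i] of the pairwise proximity matrix, and a
    perceptron learns a weight per training object (sketch only; no
    Structural Risk Minimization)."""
    n = P.shape[0]
    w = np.zeros(n)
    b = 0.0
    for _ in range(epochs):
        for i in range(n):
            if y[i] * (P[i] @ w + b) <= 0:   # misclassified or on boundary
                w += lr * y[i] * P[i]
                b += lr * y[i]
    return w, b

# Toy data: two groups on a line, proximities = negative distances
pts = np.array([0.0, 1.0, 5.0, 6.0])
y = np.array([-1, -1, 1, 1])
P = -np.abs(pts[:, None] - pts[None, :])
w, b = proximity_perceptron(P, y)
preds = np.sign(P @ w + b)
```

No feature vectors are ever constructed: the classifier is a weighting of proximities to the training objects, which is what makes the representation more general than the Euclidean one.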


Active Data Clustering

Hofmann, Thomas, Buhmann, Joachim M.

Neural Information Processing Systems

Active data clustering is a novel technique for clustering of proximity data which utilizes principles from sequential experiment design in order to interleave data generation and data analysis. The proposed active data sampling strategy is based on the expected value of information, a concept rooted in statistical decision theory. This is considered to be an important step towards the analysis of large-scale data sets, because it offers a way to overcome the inherent data sparseness of proximity data.
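The interleaving of data generation and data analysis can be sketched as a query loop. Note that this toy swaps the paper's expected-value-of-information criterion for a random query order and a trivial threshold "clustering", purely to show the loop structure over a sparsely observed proximity matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

def active_clustering_sketch(true_D, n_queries):
    """Alternate between querying pairwise dissimilarities (data
    generation) and clustering from the partially observed matrix
    (data analysis). Query selection here is random, NOT the paper's
    expected value of information."""
    n = true_D.shape[0]
    observed = np.full((n, n), np.nan)
    np.fill_diagonal(observed, 0.0)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    rng.shuffle(pairs)
    for i, j in pairs[:n_queries]:                 # data generation step
        observed[i, j] = observed[j, i] = true_D[i, j]
    # data analysis step: impute unqueried entries with the observed
    # mean, then split objects by their dissimilarity to object 0
    filled = np.where(np.isnan(observed), np.nanmean(observed), observed)
    return (filled[0] > np.median(filled[0])).astype(int)

# Two tight groups on a line; querying all 6 pairs recovers the split
pts = np.array([0.0, 0.1, 5.0, 5.1])
D = np.abs(pts[:, None] - pts[None, :])
labels = active_clustering_sketch(D, n_queries=6)
```

The point of the paper's criterion is precisely to do better than random here: to spend each of the O(n^2) possible queries where it most reduces uncertainty about the cluster structure.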

